A supernumerary synthetic chromosome in Komagataella phaffii as a repository for extraneous genetic material

Background
Komagataella phaffii (Pichia pastoris) is a commercially important, methylotrophic, non-conventional yeast that grows in a fermentor to exceptionally high densities on simple media and secretes recombinant proteins efficiently. Genetic engineering strategies are being explored in this organism to facilitate cost-effective biomanufacturing. Small, stable artificial chromosomes in K. phaffii could offer unique advantages by accommodating multiple integrations of extraneous genes and their promoters without accumulating perturbations of native chromosomes or exhausting the supply of selection markers.

Results
Here, we describe a linear “nano”chromosome (of 15–25 kb) that, according to whole-genome sequencing, persists in K. phaffii over many generations at one copy per cell, provided that non-homologous end joining is compromised (by knocking out KU70). The nanochromosome includes a copy of the centromere from K. phaffii chromosome 3, a K. phaffii-derived autonomously replicating sequence on either side of the centromere, and a pair of K. phaffii-like telomeres. It contains, within its q arm, a landing zone in which genes of interest alternate with long (approx. 1-kb) stretches of non-coding DNA chosen to facilitate homologous recombination and to serve as spacers. The landing zone can be extended along the nanochromosome, in an inch-worming mode of sequential gene integrations, accompanied by recycling of just two antibiotic-resistance markers. The nanochromosome was used to express PDI, a gene encoding protein disulfide isomerase. Co-expression with PDI allowed the production, from a genomically integrated gene, of secreted murine complement factor H, a plasma protein containing 40 disulfide bonds. As further proof-of-principle, we co-expressed, from a nanochromosome, both PDI and a gene for GFP-tagged human complement factor H under the control of PAOX1 and demonstrated that the secreted protein was active as a regulator of the complement system.

Conclusions
We have added K. phaffii to the list of organisms that can produce human proteins from genes carried on a stable, linear, artificial chromosome. We envisage using nanochromosomes as repositories for numerous extraneous genes, allowing intensive engineering of K. phaffii without compromising its genome or weakening the resulting strain.

Supplementary Information
The online version contains supplementary material available at 10.1186/s12934-023-02262-4.


Supplementary Figure S1
The route to our nanochromosome-carrying strains of K. phaffii
This is a more detailed version of Figure 1A. Circles represent plasmids engineered in E. coli (eDAxxx etc.). See also the list of plasmids in Additional file 2: Table 2 and the set of nanochromosomes drawn in Figure 1B. Boxes represent K. phaffii strains (yDAxxx etc.) that contain nanochromosomes. See also the list of strains in Additional file 2: Table 3.
[Figure labels: ZeoR ex eDA40; HygR ex eDA37; to eDA131; pPicZα; pUC19]

The upper schematic shows the proto-telomere[i-SceI-recognition site]proto-telomere structure expected after annealing and then ligation of four oligos, color-coded as in the lower schematic (see Additional file 2: Table 1). Grey boxes represent some of the seven-base-pair telomere repeats. The PvuI-recognition site allows Tel excision from, and insertion into, plasmids. The confirmatory gel shows expected bands for the individual oligos (see schematic for numbering) and the products of annealing.

a p arm, while high coverage suggests there are between two and four copies, per cell, of a majority of the nChr 1 sequence. This result is compatible with de novo assembly (details described in the Results section, sequences deposited in Additional file 3) suggesting fusion between the nChr 1 q arm and the ~34-kb Chr 3 p arm. A vertical arrow in the Chr 3 plot indicates data consistent with duplication of the Chr 3 p arm and its centromere. In (b), the values (>2) for normalised coverage for nChr 2 suggest multiple copies per cell of its DNA content. The high copy number for the Chr 3 centromere correlates with a high copy number of nChr 2 and supports the existence of chimeric multi-centric nanochromosomes. This observation is compatible with the yDA175 and yDA177 de novo assembly results obtained from long-read WGS (Additional file 3). (c)-(g): The data for each strain on a ΔKU70 background are consistent with a single copy per cell of its nanochromosome, specifically: (c), yDA253 with a single copy of nChr 2A; (d), yDA260 with the mFH gene integrated into (native) Chr 4 and a single copy of nChr 2A; (e), yDA263 with a single copy of nChr 2A.1 (i.e. after replacing HygR in nChr 2A with GFP-ZeoR); (f) and (g), yDA275 and yDA277 (biological replicates) with single copies of nChr 2A.2 (i.e. after the inch-worming proof-of-principle experiment). All raw data are deposited at BioProject PRJNA971544.
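The copy-number reading of the coverage plots can be illustrated with a minimal sketch. It assumes that normalised coverage is the mean read depth across a region divided by the median depth across the native chromosomes of a haploid cell, so that a value near 1 corresponds to one copy per cell. That definition, the function names and the example depths are assumptions made for illustration; they are not taken from the study, whose own definition is given in the legend of Figure 5.

```python
from statistics import median


def normalised_coverage(region_depths, reference_depths):
    """Mean depth of a region divided by the median depth of the reference.

    Hypothetical helper: the normalisation used in the paper may differ.
    """
    reference = median(reference_depths)
    if reference <= 0:
        raise ValueError("reference depth must be positive")
    return (sum(region_depths) / len(region_depths)) / reference


def estimated_copies(norm_cov):
    """Nearest whole number of copies per (haploid) cell."""
    return round(norm_cov)


# Illustrative per-window depths, not data from the study.
native_chromosomes = [98, 100, 102]
single_copy_nchr = [100, 110, 90]
multi_copy_nchr = [300, 310, 290]

print(estimated_copies(normalised_coverage(single_copy_nchr, native_chromosomes)))
print(estimated_copies(normalised_coverage(multi_copy_nchr, native_chromosomes)))
```

Under these assumptions the first region reads as one copy per cell and the second as three, mirroring the contrast drawn between the ΔKU70 and wild-type backgrounds.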

Supplementary Figure S9
Validating the KU70-knockout K. phaffii strain employed in this study to host nanochromosomes
(a) Upper box: Annotated visualisation (IGV) of WGS results confirms deletion of CG dinucleotide spanning codons for KU70 His25-Ala26. The KU70 start codon is indicated with a black rectangle. Lower box: Annotated SnapGene screenshot showing the sequence of KU70 (on Chr 3) and highlighting the dinucleotide (GC) deleted in our ΔKU70 strain. (b) Serial dilution to compare viabilities of wild-type and ΔKU70 strains under various conditions. CBS7435 (WT) cells and two ΔKU70 isolates, yDA208 (ΔKU70-1) and yDA210 (ΔKU70-2), were inoculated into YPD and grown for 24 hours at 30 °C. Cultures were adjusted to A600 = 0.1, then subjected to a series of tenfold dilutions for plating onto YPD or YPG (glycerol) agar. Plates were incubated at 30 °C for two days, or at 37 °C for four days.

Supplementary Figure S13
Assessment of nChr 2A persistence in K. phaffii strain yDA260
(a) Cells grown for 12 hours in YPD (+ hygromycin) were used to inoculate YPG (no antibiotic). After 24 hours of exponential growth in YPG, cells were spun down then resuspended in YPM (no antibiotic). After 36 hours of methanol induction, cells were streaked out on YPD plates, with or without hygromycin. Retention of HygR (and, by extension, nChr 2A) was assessed by comparing colony counts with versus without hygromycin. The results (right-hand panels; white numerals are colony counts obtained with ImageJ software) were expressed as the % of colonies that were no longer hygromycin resistant. Note: yDA260 carries nChr 2A; yDA264 is a control in which the HygR cassette is integrated into the native genome; yDA140 contains only the telomere-null/centromere-null eDA37 (see Additional file 2: Table 5). (b) Representative genotyping by colony PCR of K. phaffii strain yDA260 recovered from a YPD plate after the chromosome-loss assay. The (numbered) sites targeted by oligo pairs (Additional file 2: Suppl. Table 2) are indicated on the schematic of nChr 2A.
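The arithmetic behind the chromosome-loss readout can be sketched as follows. The legend states only that colony counts with and without hygromycin were compared and expressed as the % of colonies no longer resistant; the exact formula, the clamping of negative values to zero (to absorb plating noise) and the function name are assumptions for illustration, and the counts are invented rather than taken from the figure.

```python
def percent_marker_loss(colonies_without_hyg, colonies_with_hyg):
    """Percentage of colonies that no longer grow on hygromycin.

    Hypothetical helper: assumes both plates received equivalent inocula,
    so the no-antibiotic count stands for the total viable population.
    """
    if colonies_without_hyg <= 0:
        raise ValueError("need at least one colony on the no-antibiotic plate")
    if colonies_with_hyg < 0:
        raise ValueError("colony counts cannot be negative")
    lost_fraction = 1 - colonies_with_hyg / colonies_without_hyg
    # Counting noise can give slightly more colonies on the selective plate;
    # report that as zero loss rather than a negative percentage.
    return max(0.0, 100 * lost_fraction)


# Illustrative counts only.
print(percent_marker_loss(200, 150))  # 25.0
print(percent_marker_loss(100, 100))  # 0.0
```

A stably maintained nanochromosome would be expected to give a value close to that of the genomically integrated control, whereas a telomere-null/centromere-null plasmid would be expected to give a much higher one.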
Preparation of framework plasmid eDA53 (see also Fig. 2)
(a) The DNA parts (annotated by SnapGene Viewer) used for Gibson assembly. Fragment 1: eDA24 (source of CEN3) linearized with blunt-end restriction enzymes. Fragment 2: PCR-amplified ZeoR, ex eDA40. Fragment 3: PCR-amplified PARS-A76, ex eDA26 (see Additional file 2: Table 1 for oligo sequences). (b) Purified fragments 1-3 ran as expected on an agarose gel. (c) Map of eDA53 with location of primer pairs used for verification by PCR. Of five E. coli colonies selected on plates with 50 µg/mL zeocin and 100 µg/mL ampicillin, only one colony (number 4, highlighted) was verified, and Sanger sequencing confirmed the expected sequence of the extracted plasmid.
Construction of the telomeres part (Tel) (see also Fig. 2)

Validation of Tel-carrying plasmid eDA131 (see also Fig. 2)
(a) Map of eDA131 showing the locations of the various sequences targeted by oligos (red text and arrows) used for colony PCR-based screening of E. coli transformants, following ligation between cleaved eDA40 and Tel. (b) Among zeocin-resistant and ampicillin-sensitive strains, colony PCR using oligos 280/282 generated bands corresponding to the expected (584-bp) amplicon in clones 1, 3 and 6 only. (c) Plasmids isolated from clones 1, 3 and 6, following digestion with FspI/PvuII and gel electrophoresis, yielded a pattern of bands close to the one predicted (by SnapGene). The Tel part could thus be digested with blunt-end restriction enzymes PvuII and FspI, then gel-extracted prior to ligation with framework plasmid eDA53.
Insertion and integration arrays
(a) Various strategies for integration via double-crossover HR of genes, delivered within an in vitro-assembled insertion array, into a K. phaffii nanochromosome-resident integration array. (b) An example of array assembly. PCR-amplified parts digested with BsmBI were gel-purified then ligated pairwise. Products were gel-purified and ligated into BsmBI-linearised pUC19. Resultant plasmids were used to transform E. coli then Sanger sequenced. In preparation for deployment, arrays were subsequently excised with AsiSI, or amplified by PCR (dx.doi.org/10.17504/protocols.io.bp2l69p95lqe/v1). Additional file 2: Table 2 contains a list of insertion and integration arrays.

Supplementary Figure S7
Whole-genome sequencing coverage data for wild-type and ΔKU70 K. phaffii cells containing nanochromosomes (This is an extended version of Fig. 5)
See legend of Figure 5 in main text for an explanation of axes and symbols. (a) and (b): For strains on wild-type background, plots of normalized coverage indicate two to four copies of nanochromosomal sequences per cell. Note in (a) that nChr 1 of yDA122 lacks